The High-Dimensional Geometry of Binary Neural Networks
Authors
Abstract
Recent research has shown that one can train a neural network with binary weights and activations by augmenting the weights with a high-precision continuous latent variable that accumulates small changes from stochastic gradient descent. However, little work explains why binary weights and activations can effectively capture the features in data. Our main result is that neural networks with binary weights and activations (BNNs) trained using the method of Courbariaux, Hubara et al. (2016) work because of the high-dimensional (HD) geometry of binary vectors. In particular, the ideal continuous vectors that extract features in the intermediate representations of these BNNs are well approximated by binary vectors, in the sense that dot products are approximately preserved. Furthermore, the results and analysis for BNNs are shown to generalize to neural networks with ternary weights and activations. Whereas previous research demonstrated good classification performance with BNNs, our work explains why these BNNs work in terms of HD geometry. Our theory serves as a foundation for understanding not only BNNs but also a variety of methods that seek to compress traditional neural networks. Furthermore, a better understanding of multilayer binary neural networks serves as a starting point for generalizing BNNs to other neural network architectures such as recurrent neural networks.
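The claim that binarization approximately preserves dot products can be checked numerically with a small sketch. The setup here is an assumption made for illustration, not the paper's experiment: the continuous vector is drawn from a standard Gaussian rather than taken from a trained network, and the helper names (`binarize`, `cosine`, `gaussian_vector`) are hypothetical. Under that Gaussian assumption, the cosine between a vector and its sign-binarized copy approaches sqrt(2/pi), roughly 0.8, as the dimension grows.

```python
import math
import random


def binarize(v):
    """Sign-binarize a vector: each component becomes +1 or -1 (zero maps to +1)."""
    return [1.0 if x >= 0 else -1.0 for x in v]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def gaussian_vector(n, rng):
    """Assumed stand-in for a continuous weight vector: i.i.d. standard normals."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


rng = random.Random(0)
n = 10_000  # dimension; the effect sharpens as n grows

w = gaussian_vector(n, rng)                 # continuous "weight" vector
noise = gaussian_vector(n, rng)
x = [wi + ni for wi, ni in zip(w, noise)]   # an input correlated with w

w_bin = binarize(w)

# Cosine between w and its binarized copy; tends to sqrt(2/pi) for Gaussian w.
cos_w_bin = cosine(w, w_bin)
limit = math.sqrt(2.0 / math.pi)

# Ratio of the binarized dot product to the continuous one. Because
# ||w_bin|| = sqrt(n) is close to ||w|| here, the ratio is also near sqrt(2/pi):
# binarizing rescales the dot product by a roughly constant factor.
dot_ratio = dot(x, w_bin) / dot(x, w)

print("cos(w, sign(w)):", cos_w_bin)
print("sqrt(2/pi):     ", limit)
print("dot ratio:      ", dot_ratio)
```

Because the rescaling factor is roughly the same for any input that correlates with `w`, the relative ordering of dot products is largely kept after binarization, which is the sense of "approximately preserved" illustrated here. In low dimensions the estimate is much noisier, so the sketch uses a large `n`.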
Related resources
Prediction of true critical temperature and pressure of binary hydrocarbon mixtures: A Comparison between the artificial neural networks and the support vector machine
Two main objectives have been considered in this paper: providing a good model to predict the critical temperature and pressure of binary hydrocarbon mixtures, and comparing the efficiency of the artificial neural network algorithms and the support vector regression as two commonly used soft computing methods. In order to have a fair comparison and to achieve the highest efficiency, a comprehen...
Neural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features
This paper presents a comparison study between the multilayer perceptron (MLP) and radial basis function (RBF) neural networks with supervised learning and back propagation algorithm to track hand gestures. Both networks have two output classes which are hand and face. Skin is detected by a regional based algorithm in the image, and then networks are applied on video sequences frame by frame in...
Artificial Neural Networks (ANN) for the simultaneous spectrophotometric determination of fluoxetine and sertraline in pharmaceutical formulations and biological fluid
Simultaneous spectrophotometric estimation of Fluoxetine and Sertraline in tablets was performed using UV–Vis spectroscopy and Artificial Neural Networks (ANN). Absorption spectra of the two components were recorded in the 200–300 nm wavelength region with an interval of 1 nm. The calibration models were thoroughly evaluated at several concentration levels using the spectra of synthetic binary mix...
Delineation of alteration zones based on kriging, artificial neural networks, and concentration–volume fractal modelings in hypogene zone of Miduk porphyry copper deposit, SE Iran
This paper presents a quantitative modeling for delineating alteration zones in the hypogene zone of the Miduk porphyry copper deposit (SE Iran) based on the core drilling data. The main goal of this work was to apply the Ordinary Kriging (OK), Artificial Neural Networks (ANNs), and Concentration-Volume (C-V) fractal modelings on Cu grades to separate different alteration zones. Anisotropy was ...
Journal: CoRR
Volume: abs/1705.07199
Publication date: 2017